1 Summary

  • This study examined the heterogeneous impact of a groundwater allocation policy on farmers’ groundwater use in Nebraska and identified the main driver of that heterogeneity.

  • Results suggest that the impact varies with in-season total precipitation. Farmers in low-precipitation regions are affected more strongly by the allocation limits than farmers in high-precipitation regions.

  • This implies that the allocation limits should be adjusted along the spatial precipitation gradient while keeping the total allocation limit unchanged. This would reduce the economic damage of the regulation to farmers.



2 Data

2.1 Main Data

This study used annual well-level groundwater extraction data from two of Nebraska’s Natural Resource Districts (NRDs) belonging to the Republican River Basin: Lower Republican (LR) and Tri-Basin (TB). Metered groundwater use data over the allocation period from 2008 to 2012 was collected from the Republican River Compact Administration for irrigation wells located in LR and TB.

Each NRD has autonomy in the type of regulation used. Specifically, LR (treatment group) had an allocation of 45 acre-inches in total during the study period (9 acre-inches per year). Within the total allocation amount of 45 acre-inches, an irrigator in the LR was allowed to pump any amount of water for each year in the allocation period. Meanwhile, the majority of the TB (control group) had no allocation throughout the period.

  • NOTE: One small portion of the TB had a 9 acre-inch per year allocation from 2009 to 2012; we included it in the treatment group for our analysis.
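The multi-year allocation rule described above can be pictured as a running budget. The sketch below is only an illustration: the 45-unit total comes from the text, while the function name and the pumping record are hypothetical.

```python
TOTAL_ALLOCATION = 45.0  # five-year total for 2008-2012, as stated in the text


def remaining_allocation(annual_use):
    """Return (year, balance) pairs; raise if cumulative use exceeds the total."""
    balance = TOTAL_ALLOCATION
    balances = []
    for year, use in sorted(annual_use.items()):
        balance -= use
        if balance < 0:
            raise ValueError(f"allocation exceeded in {year}")
        balances.append((year, balance))
    return balances


# Hypothetical pumping record: light use early, heavy use in the final year.
use = {2008: 8.0, 2009: 7.0, 2010: 6.0, 2011: 9.0, 2012: 14.0}
print(remaining_allocation(use))
```

Any single year may exceed 9 units as long as the cumulative total stays within 45, which is the flexibility the allocation rule gives irrigators.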

To support the exogeneity of the treatment assignment (allocation limits), we restricted our analysis to wells located within 5 miles of the NRD boundary between the LR and TB. To check robustness to the choice of the 5-mile buffer, we also conducted the analysis with a 10-mile buffer. Figure 1 illustrates the spatial distribution of wells located within the 5-mile buffer. The wells colored in blue were used in the analysis.

Figure 1: Spatial distribution of wells located within the 5 mile buffer in the LR and TB
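The buffer restriction can be sketched as a simple distance filter. This is a minimal illustration, assuming well and boundary coordinates have already been projected to a planar system measured in miles; the function names, coordinates, and well IDs are hypothetical and are not taken from the study's data.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def wells_within_buffer(wells, boundary, buffer_miles=5.0):
    """Keep wells whose distance to the boundary polyline is at most buffer_miles."""
    kept = []
    for well_id, xy in wells.items():
        d = min(
            point_segment_distance(xy, a, b)
            for a, b in zip(boundary, boundary[1:])
        )
        if d <= buffer_miles:
            kept.append(well_id)
    return kept


# Hypothetical boundary polyline and well locations (miles).
boundary = [(0.0, 0.0), (0.0, 10.0), (5.0, 20.0)]
wells = {"w1": (3.0, 5.0), "w2": (-7.0, 5.0), "w3": (4.0, 12.0)}
print(wells_within_buffer(wells, boundary))
```

Calling the same function with `buffer_miles=10.0` corresponds to the robustness check with the wider buffer.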


It is well established that soil and weather conditions affect irrigation water use. Thus, we controlled for those factors in the regression analysis. For weather variables, we obtained total precipitation (in inches) and total reference grass evapotranspiration (in inches) for the irrigation season, April through September, from the daily weather records of the gridMET database. For soil characteristics, we collected the percentages of silt and clay, hydraulic conductivity (in um/s), water holding capacity (in cm/cm), and slope (in percent) from the Soil Survey Geographic (SSURGO) database.
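The in-season totals can be illustrated with a short sketch. It assumes the daily records are available as a mapping from calendar date to a value in inches; that structure, the function name, and the sample values are hypothetical.

```python
from datetime import date


def in_season_total(daily, year, start_month=4, end_month=9):
    """Sum daily values (inches) from April through September of the given year."""
    return sum(
        value
        for day, value in daily.items()
        if day.year == year and start_month <= day.month <= end_month
    )


# Hypothetical daily precipitation records (inches).
daily_precip = {
    date(2008, 3, 31): 0.40,  # before the season, excluded
    date(2008, 4, 1): 0.25,
    date(2008, 7, 15): 1.10,
    date(2008, 9, 30): 0.05,
    date(2008, 10, 1): 0.60,  # after the season, excluded
}
print(round(in_season_total(daily_precip, 2008), 2))
```

The same function would apply to daily reference evapotranspiration to obtain the in-season total for that variable.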

2.2 Data Processing

In the LR, there are two types of pooling for irrigators to use allocated groundwater flexibly.

The first type is called a pooling arrangement. It was available to any single irrigator who owned multiple farmlands with Certified Irrigated Acres under common ownership. Such an irrigator was allowed to reallocate the combined groundwater allocation among those farmlands. The second type is called a pooling agreement. This is an agreement between two or more irrigators. Once the agreement was established, the parties were allowed to share their combined allocation amount.

Due to a lack of data, we could not tell which irrigators entered into a pooling agreement.

Instead, we accounted for the pooling arrangement. For this purpose, we aggregated the LR well-level data to the irrigator level for each year. Specifically, for each year, if there were several data points for different wells managed by a common irrigator, those data points were aggregated into a single data point by taking the area-weighted average of groundwater usage, weather, and soil data by irrigator. The aggregation required distinguishing irrigators who had wells only inside the buffer from irrigators who also had wells outside the buffer.
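The area-weighted aggregation can be sketched as follows. This is a minimal illustration under assumed field names (`irrigator`, `year`, `acres`, `usage`); only `pr_in` appears in the report, and the sample records are hypothetical.

```python
from collections import defaultdict


def aggregate_by_irrigator(records, value_keys):
    """Collapse well-year records to irrigator-year rows using acre-weighted means."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["irrigator"], rec["year"])].append(rec)
    rows = []
    for (irrigator, year), recs in sorted(groups.items()):
        total_acres = sum(r["acres"] for r in recs)
        row = {"irrigator": irrigator, "year": year, "acres": total_acres}
        for key in value_keys:
            # Weight each well's value by its irrigated acres.
            row[key] = sum(r[key] * r["acres"] for r in recs) / total_acres
        rows.append(row)
    return rows


# Hypothetical well-year records: irrigator A has two wells, B has one.
records = [
    {"irrigator": "A", "year": 2008, "acres": 100.0, "usage": 10.0, "pr_in": 16.0},
    {"irrigator": "A", "year": 2008, "acres": 300.0, "usage": 6.0, "pr_in": 20.0},
    {"irrigator": "B", "year": 2008, "acres": 120.0, "usage": 8.0, "pr_in": 18.0},
]
for row in aggregate_by_irrigator(records, ["usage", "pr_in"]):
    print(row)
```

An irrigator with a single well passes through unchanged, so the same function can be applied to the whole treatment-side dataset.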

Originally, our data with the 5-mile buffer on the LR side contained well data for 386 irrigators. We found that 82 of the 386 irrigators in the LR also had wells outside the buffer. The data on those 82 irrigators were removed.
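The exclusion step amounts to dropping any irrigator with at least one well outside the buffer. A minimal sketch, assuming each well record carries a hypothetical `in_buffer` flag:

```python
def irrigators_fully_inside(wells):
    """Return irrigators whose wells all lie inside the buffer."""
    everyone = {w["irrigator"] for w in wells}
    has_outside_well = {w["irrigator"] for w in wells if not w["in_buffer"]}
    return sorted(everyone - has_outside_well)


# Hypothetical well records.
wells = [
    {"irrigator": "A", "in_buffer": True},
    {"irrigator": "A", "in_buffer": True},
    {"irrigator": "B", "in_buffer": True},
    {"irrigator": "B", "in_buffer": False},  # B is dropped entirely
    {"irrigator": "C", "in_buffer": True},
]
print(irrigators_fully_inside(wells))
```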

After this step, Table 1 shows the number of remaining irrigators categorized by the number of wells owned.

The number of irrigators who owned more than one well was 143 in total. All the data related to these irrigators were subject to the aggregation.

For the TB, annual well-level data were used without aggregation. Consequently, the final regression data include 5599 observations, of which 1719 are in the treatment group and 3880 are in the control group.



3 Method

This study used the Causal Forest (CF) model (Athey and Imbens 2016; Wager and Athey 2018). CF is a machine learning method developed specifically for identifying heterogeneous treatment effects. We employed cluster-robust estimation to take into account natural clusters in the data. Each cluster consists of time-series observations of the wells located within the same county. In CF modeling, the hyperparameters (e.g., the minimum node size in each tree, the number of covariates used for node splitting, and parameters involving an honest tree-building process) are tuned internally by cross-validation. The number of trees was set at 4000.
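For reference, the quantity a causal forest targets can be written in standard potential-outcomes notation. The symbols below are assumptions introduced for illustration and do not appear in the original report: \(Y_i(1)\) and \(Y_i(0)\) denote water use of unit \(i\) with and without the allocation limit, and \(X_i\) denotes its covariates (weather and soil variables).

```latex
\[
  \tau(x) = \mathbb{E}\left[\, Y_i(1) - Y_i(0) \mid X_i = x \,\right]
\]
```

A negative \(\tau(x)\) means that, for units with covariates \(x\), the allocation limit reduced water use on average.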

The trained CF model was used to predict treatment effects on a testing dataset in which a single variable varied within a reasonable range while the other variables were held fixed at their mean values. Repeating this for each independent variable in turn gave the predicted treatment effects associated with every independent variable.
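The testing dataset described above can be sketched as a one-variable-at-a-time grid. The function name, the covariate means, and the sweep range are hypothetical; only the variable names `pr_in`, `pet_in`, and `awc_r` come from the report.

```python
def one_at_a_time_grid(means, variable, low, high, n_points=5):
    """Build rows where `variable` sweeps [low, high] and the rest stay at their means."""
    step = (high - low) / (n_points - 1)
    rows = []
    for i in range(n_points):
        row = dict(means)  # copy so each row is independent
        row[variable] = low + i * step
        rows.append(row)
    return rows


# Hypothetical covariate means.
means = {"pr_in": 19.0, "pet_in": 32.0, "awc_r": 0.19}
grid = one_at_a_time_grid(means, "pr_in", 15.0, 25.0, n_points=5)
for row in grid:
    print(row)
```

Each row of such a grid would then be passed to the trained model's prediction routine to trace how the estimated effect changes along that one variable.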

In an individual tree of the CF, samples are recursively split into two groups using a variable selected to maximize the heterogeneity in the treatment effect estimates across the resulting two groups. Therefore, the number of times a specific variable is selected as a split variable indicates the importance of that variable in estimating the heterogeneity of the treatment effect. We extracted that information to identify the variables potentially driving heterogeneity in the treatment effect.
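The split-count idea can be illustrated with a toy example. This sketch assumes each tree is represented simply as the list of variables it split on, which is a hypothetical simplification; it uses plain counts and does not represent any depth weighting that a particular software package may apply. The variable name `slope` is illustrative.

```python
from collections import Counter


def split_frequency(forest):
    """Share of all splits that use each variable, across every tree."""
    counts = Counter(var for tree in forest for var in tree)
    total = sum(counts.values())
    return {var: n / total for var, n in counts.most_common()}


# Hypothetical forest: each inner list holds the split variables of one tree.
forest = [
    ["pr_in", "awc_r", "pet_in"],
    ["awc_r", "pr_in", "slope"],
    ["pet_in", "awc_r"],
]
for var, share in split_frequency(forest).items():
    print(var, round(share, 3))
```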



4 Results

Figure 2: Histogram of out-of-bag CATE estimates


The conditional average treatment effect (CATE) estimate was -1.338 with 95% CI [-2.569, -0.106]. This means that irrigators reduced water use by about 1.3 inches on average relative to no allocation. Figure 2 shows the histogram of out-of-bag CATE estimates. It shows that the effect of the allocation limit varies widely across observations.


Figure 3: The impact of allocation limits on irrigators’ water use (in inches) associated with each covariate



Figure 3 shows how the estimated treatment effect varies with each variable. In the figure, the histogram of each independent variable is shown in the top row, and the estimated treatment effects (in inches) are shown in the bottom row. The shaded areas indicate 95% confidence intervals (CI). Table 2 shows the ranking of variable importance.

Table 2 suggests that the CF mainly used three variables to estimate the treatment effects: water holding capacity (awc_r), in-season total reference grass evapotranspiration (pet_in), and in-season total precipitation (pr_in). These variables were used to split nodes of the trees in the forest more than 60% of the time. Interestingly, most of the soil characteristics other than water holding capacity were rarely used to capture the treatment effect heterogeneity. Indeed, in Figure 3, the predicted treatment effects associated with those soil characteristics show relatively wide CIs or do not change dramatically.

The impact of the allocation limit becomes markedly smaller in areas where water holding capacity exceeds 0.21 cm/cm. This is likely because such farmlands retain more water in the soil by suppressing runoff and therefore require less irrigation to satisfy soil water demand.


The two weather variables, in-season total reference grass evapotranspiration (pet_in) and precipitation (pr_in), showed almost the same degree of importance in Table 2. However, Figure 3 suggests that the treatment effect varies more with in-season total precipitation (in inches). For example, in areas where in-season total precipitation was high, say 25 inches, an irrigator reduced water use by about 1.8 inches on average relative to what it would have been without the allocation limit. Meanwhile, in areas where in-season total precipitation was low, say 15 inches, an irrigator reduced water use by about 2.8 inches on average relative to the counterfactual outcome. This means that the current groundwater allocation design may be more binding in areas with low precipitation than in areas with high precipitation.


Figure 4: Spatial distribution of trends in seasonal precipitation in the LR from 2008 to 2012


As Figure 4 shows, there is a strong spatial trend in seasonal total precipitation over the LR. Overall, the annual in-season total precipitation tends to be lower in the western part of the LR than in the eastern part. Together with the results above, this suggests that there is room to improve the economic efficiency of the current groundwater allocation policy.

The results show that, in low-precipitation areas, the current groundwater allocation policy creates a greater economic loss to save the same amount of groundwater than it does in high-precipitation areas. Therefore, we suggest imposing a higher allocation limit on the low-precipitation areas (e.g., the western portion of the LR) and a lower allocation limit on the high-precipitation areas (e.g., the eastern portion of the LR). Assigning spatially varying allocation limits while keeping the total allocation amount at the current level could save the same amount of groundwater with less economic damage to irrigators.
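The idea of varying limits while holding the total fixed can be illustrated with a small sketch. All numbers here are hypothetical, including the zone acreages and the relative weights; only the 9-per-year baseline comes from the report. The function rescales the relative weights so that the acre-weighted average limit equals the baseline.

```python
def budget_neutral_limits(zones, base_limit=9.0):
    """Scale relative weights so the acre-weighted mean limit equals base_limit."""
    total_acres = sum(z["acres"] for z in zones.values())
    weighted = sum(z["acres"] * z["weight"] for z in zones.values())
    scale = base_limit * total_acres / weighted
    return {name: scale * z["weight"] for name, z in zones.items()}


# Hypothetical zones: a larger weight means a more generous limit.
zones = {
    "low_precipitation": {"acres": 40000.0, "weight": 1.2},
    "high_precipitation": {"acres": 60000.0, "weight": 0.9},
}
limits = budget_neutral_limits(zones)
for name, limit in limits.items():
    print(name, round(limit, 2))
```

How the weights should be chosen is an empirical question that this sketch does not address; it only shows that any set of relative weights can be made consistent with an unchanged total.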

In addition, under such a spatially differentiated allocation along the precipitation gradient, we might further differentiate the allocation for each well based on the water holding capacity at the site. That is, given the significant heterogeneous impact of water holding capacity, we could assign a higher (lower) allocation limit to farmland with low (high) water holding capacity.

4.1 Results with 10 mile buffer

Figure 5 shows the results from the data with the 10-mile buffer. Overall, the trends match those from the 5-mile buffer.

Figure 5: The impact of allocation limits on irrigators’ water use (in inches) associated with each covariate using 10 mile buffer data



5 Concerns

  • This analysis assumed that all irrigators who owned more than one well used a pooling arrangement. We cannot tell whether an individual farmer actually did so. A pooling arrangement requires approval by the district.

  • We cannot tell whether a farmer collaborated with other farmers to share their total groundwater allocation amount (pooling agreement).

  • We cannot identify the exact location of farmland. We know the location of wells, but a well and the farmland it irrigates are not necessarily in the same place. Since all the variables were collected at well locations, the explanatory variables (especially soil characteristics) might be badly wrong if a well is distant from the farmland.

  • We cannot predict the interaction effect between two or more variables. We can only predict the impact of an individual variable on the treatment effect.



6 References

Athey, Susan, and Guido Imbens. 2016. “Recursive Partitioning for Heterogeneous Causal Effects.” Proceedings of the National Academy of Sciences 113 (27): 7353–60.
Wager, Stefan, and Susan Athey. 2018. “Estimation and Inference of Heterogeneous Treatment Effects Using Random Forests.” Journal of the American Statistical Association 113 (523): 1228–42.